A Parallel Data Management Layer for Data Mining

نویسندگان

  • Massimo Coppola
  • Paolo Pesciullesi
  • Luigi Presti
  • Roberto Ravazzolo
  • Marco Vanneschi
چکیده

We propose the design of a data management abstraction level to implement a full set of parallel KDD applications with minimal performance overhead and greater scalability than conventional DBMS, providing a high-level parallel API to be exploited by parallel and out-of-core data mining algorithms. We describe an existing prototype and report examples and first test results with mining algorithms.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Non-zero probability of nearest neighbor searching

Nearest Neighbor (NN) searching is a challenging problem in data management and has been widely studied in data mining, pattern recognition and computational geometry. The goal of NN searching is efficiently reporting the nearest data to a given object as a query. In most of the studies both the data and query are assumed to be precise, however, due to the real applications of NN searching, suc...

متن کامل

Distributed data mining for e-business

In the internet-based e-business environment, most business data are distributed, heterogeneous and private. To achieve true business intelligence, mining large amounts of distributed data is necessary. Through a thorough literature review, this paper identifies four main issues in distributed data mining (DDM) systems for e-business and classifies modern DDM systems into three classes with rep...

متن کامل

A New Hybrid model of Multi-layer Perceptron Artificial Neural Network and Genetic Algorithms in Web Design Management Based on CMS

The size and complexity of websites have grown significantly during recent years. In line with this growth, the need to maintain most of the resources has been intensified. Content Management Systems (CMSs) are software that was presented in accordance with increased demands of users. With the advent of Content Management Systems, factors such as: domains, predesigned module’s development, grap...

متن کامل

Prediction-Based Portfolio Optimization Model for Iran’s Oil Dependent Stocks Using Data Mining Methods

This study applied a prediction-based portfolio optimization model to explore the results of portfolio predicament in the Tehran Stock Exchange. To this aim, first, the data mining approach was used to predict the petroleum products and chemical industry using clustering stock market data. Then, some effective factors, such as crude oil price, exchange rate, global interest rate, gold price, an...

متن کامل

A Message-Passing Distributed Memory Parallel Algorithm for a Dual-Code Thin Layer, Parabolized Navier-Stokes Solver

In this study, the results of parallelization of a 3-D dual code (Thin Layer, Parabolized Navier-Stokes solver) for solving supersonic turbulent flow around body and wing-body combinations are presented. As a serial code, TLNS solver is very time consuming and takes a large part of memory due to the iterative and lengthy computations. Also for complicated geometries, an exceeding number of grid...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005